\subsection{Basic properties}
We have a notion of `size' for finite sets.
Is there such an analogous notion for infinite sets?
We will say that a set \(X\) is countable if \(X\) is finite, or it bijects with \(\mathbb N\).
Equivalently, we can list out the elements of the set, and each element will appear in the list.
Here are some examples.
\begin{enumerate}
	\item Clearly any finite set is countable.
	\item \(\mathbb N\) is countable.
	\item \(\mathbb Z\) is countable, let us construct the list of numbers
	      \[
		      0, 1, -1, 2, -2, 3, -3, 4, -4, \dots
	      \]
\end{enumerate}
It makes sense now to consider two sets to have the same size if they biject with each other.
\begin{proposition}
	A set \(X\) is countable if and only if it injects into \(\mathbb N\).
\end{proposition}
\begin{proof}
	The forward implication is trivial: if \(X\) is finite, then there must be an injection in to \(\mathbb N\), and if it bijects with \(\mathbb N\) then that bijection is a valid injection.
	This encompasses both cases of countable sets.

	Now let us consider the reverse implication.
	We may assume \(X\) is infinite, since if \(X\) is finite then by definition \(X\) is countable.
	We know that \(X\) injects onto \(\mathbb N\) under some injective function \(f\), so \(X\) bijects with \(\Im f\).
	So it is enough to show that the image \(\Im f\) is countable.
	We will now set \(a_1\) to be the least element of \(\Im f\), and \(a_2\) to be the least element not equal to \(a_1\), and so on.
	In general, \(a_n = \min (\Im f \setminus \{ a_i : 0 \leq i < n \})\).
	Then \(\Im f\) is the set \(\{ a_1, a_2, \dots \}\).
	There are no extra elements that we have not covered, since each \(a \in X\) is \(a_n\) for some \(n\), because \(a=a_n, n \leq a\).
	So we have listed elements of \(\Im f\), so \(\Im f\) is countable, so \(X\) is countable.
\end{proof}
Thus, we can view countability as being `at most as large as \(\mathbb N\)'.
For instance, any subset of a countable set is also countable.

\begin{remark}
	In \(\mathbb R\), let
	\[
		X = \left\{ \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \dots \right\} \cup \{ 1 \}
	\]
	Then \(X\) is countable, as we can list it as
	\[
		1, \frac{1}{2}, \frac{2}{3}, \frac{3}{4}, \dots
	\]
	But if we counted from `least element' to `most element', we would never reach the element 1 in countable time.
	Note further that if we find it difficult to construct a list for a set, it does not mean it is uncountable, it could just mean that we haven't found the right list yet.
\end{remark}

\subsection{Products of countable sets}
\begin{theorem}
	\(\mathbb N \times \mathbb N\) is countable.
\end{theorem}
\begin{proof}[Proof 1]
	We will define \(a_1 = (1, 1)\), and inductively define
	\[
		a_n = \begin{cases}
			(p-1, q+1) & \text{if } p > 1 \\
			(q+1, 1)   & \text{if } p = 1
		\end{cases}
	\]
	where \(a_{n-1} = (p, q)\).
	Therefore, we are essentially moving across antidiagonals of the plane.
	This does hit every point \((x, y) \in \mathbb N \times \mathbb N\), for example by induction on \(x+y\), so we have listed all elements of \(\mathbb N \times \mathbb N\).
\end{proof}
\begin{proof}[Proof 2]
	If we can define an injective function \(\mathbb N \times \mathbb N \to \mathbb N\), then it is countable.
	For example, let \(f = 2^x 3^y\).
	\(f\) is injective, so \(\mathbb N \times \mathbb N\) is countable.
\end{proof}

\subsection{Countable unions of countable sets}
Proof 2 is also a way to show the following theorem:
\begin{theorem}
	Let \(A_1, A_2, A_3, \dots\) be countable sets.
	Then \(A_1 \cup A_2 \cup A_3 \cup \dots\) is countable.
	Less formally, `a countable union of countable sets is countable'.
\end{theorem}
\begin{proof}
	For each \(i\), \(A_i\) is countable, so we can list \(A_i\) as \(a_{i1}, a_{i2}, a_{i3}, \dots\) which may or may not terminate.
	We can then define
	\[
		f\colon \bigcup_{n \in \mathbb N}A_n \to \mathbb N;\quad f(x) = 2^i 3^j
	\]
	where \(x = a_{ij}\).
	If \(x\) is in more than one set, just take the least \(i\) that is valid.
	Then \(f\) is an injection so the union is countable.
\end{proof}
Here are some examples of using this theorem by partitioning sets as a countable union of countable subsets.
\begin{enumerate}
	\item \(\mathbb Q\) is countable, since it is a countable union of countable sets:
	      \[
		      \mathbb Q = \mathbb Z \cup \frac{1}{2}\mathbb Z \cup \frac{1}{3}\mathbb Z \cup \dots
	      \]
	      Each \(\frac{1}{n}\mathbb Z\) is countable, since they biject with \(\mathbb Z\) which is a countable set.
	      It doesn't matter if we've counted an element in \(\mathbb Q\) twice; the above theorem works even with intersecting sets.
	\item The set \(\mathbb A\) of all algebraic numbers is countable.
	      It is enough to show that the set of integer polynomials is countable, since each polynomial has a finite amount of roots and then \(\mathbb A\) is a countable union of finite sets.
	      Now, to show that the set of integer polynomials is countable, it is enough to show that for each degree \(d\) it is countable, since it is a countable union of all polynomials of degree \(d\) (again using the above theorem).
	      To specify a polynomial of degree \(d\) you must name its coefficients, so this set injects into \(\mathbb Z^{d+1}\), so we must just show that \(\mathbb Z^{d+1}\) is countable (not a bijection since the first term of the polynomial must be nonzero).
	      We know that \(\mathbb Z^n\) is countable because we can inductively show that \(\mathbb Z^2, \mathbb Z^3, \mathbb Z^4, \dots\) are countable inductively.
\end{enumerate}

\subsection{Uncountable sets}
\begin{definition}
	A set is uncountable if there is no way to count the set.
\end{definition}
\begin{theorem}
	\(\mathbb R\) is uncountable.
\end{theorem}
\begin{proof}[Proof (Cantor's Diagonal Argument)]
	We will show that \((0, 1)\) is uncountable, then clearly \(\mathbb R\) is uncountable.
	Suppose \((0, 1)\) is countable.
	Then given a sequence \(r_1, r_2, \dots\) in \((0, 1)\), we just need to find some number \(s \in (0, 1)\) not contained within this sequence.
	For each \(r_n\), we have a decimal expansion \(r_n = 0.r_{n1}r_{n2}r_{n3}\dots\).
	Let us now write all of these numbers in a matrix-style form:
	\begin{align*}
		r_1 & = 0.r_{11}r_{12}r_{13}\dots \\
		r_2 & = 0.r_{21}r_{22}r_{23}\dots \\
		r_3 & = 0.r_{31}r_{32}r_{33}\dots \\
		\vdots
	\end{align*}
	We just need to construct some number \(s\) that is not in this list.
	So, let us simply make sure that for any given \(r\) value, there is at least one digit that does not match.
	The easiest way to construct such a number is
	\[
		s = 0.s_1 s_2 s_3 \dots
	\]
	where \(s_1 \neq r_{11}\), \(s_2 \neq r_{22}\), \(s_3 \neq r_{33}\) and so on.
	We can pick any numbers we like according to these constraints, but we should avoid picking digits 0 and 9 since \(0.1000\dots = 0.0999\dots\) for example, which could cause unnecessary ambiguity.
	Then \(s \neq r_1, s \neq r_2, \dots\) since there is at least one mismatched digit in the expansion for each \(r_i\); they differ in decimal digit \(i\).
	So \(\mathbb R\) is uncountable.
\end{proof}
This is another proof that transcendental numbers exist.
\(\mathbb R\) is uncountable and \(\mathbb A\) is countable, so there exists a transcendental number.
Indeed, `most' numbers are transcendental, i.e.\ \(\mathbb R \setminus \mathbb A\) is uncountable (because if \(\mathbb R \setminus \mathbb A\) were countable, then \(\mathbb R\) would be \((\mathbb R \setminus \mathbb A) \cup \mathbb A\) which is a finite union of countable sets \contradiction).
\begin{theorem}
	The power set \(\mathcal P(\mathbb N)\) is uncountable.
\end{theorem}
\begin{proof}
	Suppose the subsets of \(\mathbb N\) are listed as \(S_1, S_2, S_3, \dots\) then we want to construct another set \(S\) that is not equal to any of the other sets \(S_i\).
	So for each set \(S_i\), we must ensure that \(S\) and \(S_i\) differ for at least one value.
	An easy way to do this is to include the number \(i\) in the subset if \(S_i\) does not contain the number, and to exclude \(i\) if \(i \in S_i\).
	Then \(S\) differs from \(S_i\) at position \(i\).
	This is the same logic as the diagonal argument above.
	We have:
	\[
		S = \{ n \in \mathbb N : n \notin S_n \}
	\]
	So \(S\) is not on the list \(S_1, S_2, S_3, \dots\) no matter what way we choose to list the elements, so \(\mathcal P(\mathbb N)\) is uncountable.
\end{proof}
\begin{remark}
	Alternatively, we could just inject \((0, 1)\) into \(\mathcal P(\mathbb N)\).
	For example, consider \(x \in (0, 1)\) represented as \(0.x_1x_2x_3x_4\dots\) in binary where the \(x_1, x_2, \dots\) are zero or one (not ending with an infinite amount of 1s).
	We can convert this \(x\) into a subset of \(\mathbb N\) by considering the set \(\{ n \in \mathbb N : x_n = 1 \}\).
	Then the uncountability follows.
\end{remark}
In fact, our proof of this theorem shows the following.
\begin{theorem}
	For any set \(X\), there is no bijection from \(X\) to the power set \(\mathcal P(X)\).
\end{theorem}
For example, \(\mathbb R\) does not biject with \(\mathcal P(\mathbb R)\).
The proof in fact will show that there is no surjection from \(X\) to its power set; i.e.\ the power set is `larger' than \(X\).
\begin{proof}
	Given any function \(f\colon X \to \mathcal P(X)\), we will show \(f\) is not surjective.
	Let \(S = \{ x \in X: x \notin f(x) \}\).
	Then \(S\) does not belong to the image of \(f\) because they differ at element \(x\); for all \(x\) we have \(S \neq f(x)\).
\end{proof}
\begin{remark}
	Note that:
	\begin{enumerate}
		\item This is similar in some sense to Russell's paradox.
		\item This theorem gives another proof that there is no universal set \(\mathscr E\), since its power set \(\mathcal P(\mathscr E) \subseteq \mathscr E\).
		      But of course, there is always a surjection from a set to a subset.
		      This is a contradiction.
	\end{enumerate}
\end{remark}
\begin{example}
	Let \(A_i, i \in I\) be a family of open, pairwise disjoint intervals.
	Must this family be countable?
	Note that it is not as simple as just listing from left to right, for example consider
	\[
		\left(\frac{1}{2}, 1\right), \left(\frac{1}{3}, \frac{1}{2}\right), \left(\frac{1}{4}, \frac{1}{3}\right), \dots, (-1, 0)
	\]
	Then the leftmost interval is \((-1, 0)\), but there is no `next interval' just after it.
	Also consider
	\[
		\left( 0, \frac{1}{2} \right), \left( \frac{1}{2}, \frac{2}{3} \right), \left( \frac{2}{3}, \frac{3}{4} \right), \dots, (1, 2)
	\]
	Then we can list the first infinitely many intervals, but we will never reach \((1, 2)\).
	The answer turns out to be true; the family is countable.
	Here are two important proofs.
	\begin{proof}[Proof 1]
		Each interval \(A_i\) contains a rational number \(a_i\).
		The rationals \(\mathbb Q\) are countable.
		So let us just list the \(a_i\).
		The family is therefore countable.
	\end{proof}
	\begin{proof}[Proof 2]
		\(\{ i \in I: A_i \text{ has length } \leq 1\}\) is certainly countable, since it injects into \(\mathbb Z\) (here, as all \(A_i\) contain some integer).
		Further, \(\left\{ i \in I: A_i \text{ has length } \leq \frac{1}{2} \right\}\) is countable for the same reason.
		Essentially, for all \(n\), \(\left\{ i \in I: A_i \text{ has length } \leq \frac{1}{n} \right\}\) is countable.
		We have written all the intervals as a countable union (over \(n\)) of countable sets.
	\end{proof}
\end{example}

To summarise, if we want to show a set \(X\) is uncountable:
\begin{enumerate}
	\item Run a diagonal argument; or
	\item Inject an uncountable set into \(X\)
\end{enumerate}
To show a set \(X\) is countable:
\begin{enumerate}
	\item List all the elements (usually fiddly); or
	\item Inject \(X\) into \(\mathbb N\) (or another countable set); or
	\item Express \(X\) as a countable union of countable sets (usually the best); or
	\item If \(X\) is `in' or `near' \(\mathbb R\), consider \(\mathbb Q\) (see Proof 2 above).
\end{enumerate}

\subsection{Comparing sizes of sets}
Intuitively, we might think that:
\begin{itemize}
	\item `\(A\) bijects with \(B\)' means `\(A\) has the same size as \(B\)'.
	\item `\(A\) injects into \(B\)' means `\(A\) is at most as large as \(B\)'.
	\item `\(A\) surjects onto \(B\)' means `\(A\) is at least as large as \(B\)'.
\end{itemize}
Of course, these analogies break down where \(B\) is zero, since there are no functions from \(A\) to \(B\) in this case.
For these to make sense, we require (for \(A, B\neq\varnothing\)) `\(A\) injects into \(B\)' to be true if and only if `\(B\) surjects onto \(A\)', and vice versa.
\begin{itemize}
	\item In the forward direction, we are given an injection \(f\colon A \to B\).
	      Pick some point \(a_0\) in \(A\), and define a surjective function \(g\colon B \to A\) given by
	      \[
		      b \mapsto \begin{cases}
			      a   & \text{if } \exists!\ a \in A, f(a) = b \\
			      a_0 & \text{otherwise}
		      \end{cases}
	      \]
	      Since the mapping \(f\) is injective, the first case will always provide a unique value of \(a\).
	\item Proving the converse, we are given a surjection \(g\colon B \to A\).
	      For each \(a\) in \(A\), we have some \(a' \in B\) with \(g(a') = a\) since \(g\) is a surjection.
	      Let \(f(a) = a'\) for each \(a\in A\), and \(f\) is injective.
\end{itemize}

\subsection{Schr\"oder--Bernstein theorem}
Further, we must also have that if `\(A\) is at most as large as \(B\)' and `\(B\) is at most as large as \(A\)', then they must be the same size.
Otherwise this size intuition would not make sense.
\begin{theorem}[Schr\"oder--Bernstein Theorem]
	If \(f\colon A\to B\) and \(g\colon B\to A\) are injections, then there exists a bijection \(h\colon A\to B\).
\end{theorem}
\begin{proof}
	For \(a\in A\), we will write \(g^{-1}(a)\) to denote the unique \(b \in B\) such that \(g(b) = a\), if such a \(b\) exists (and similarly for \(f^{-1}(b)\)).
	The `ancestor sequence' of \(a \in A\) is
	\[g^{-1}(a), f^{-1}g^{-1}(a), g^{-1}f^{-1}g^{-1}(a), \dots\]
	which may terminate.
	So for any ancestor, after undergoing the relevant function \(f\) or \(g\) repeatedly, we will end up at \(a\).
	There are three possible behaviours:
	\begin{itemize}
		\item Let \(A_0\) be the subset of \(A\) such that the ancestor sequence stops at even time, i.e.\ the last ancestor is in \(A\);
		\item Let \(A_1\) be the subset of \(A\) such that the ancestor sequence stops at odd time, i.e.\ the last ancestor is in \(B\); and
		\item Let \(A_\infty\) be the subset of \(A\) such that the ancestor sequence does not terminate.
	\end{itemize}
	We specify 0 to be even, i.e.\ if \(a\in A\) has no ancestor \(g^{-1}(a)\), then \(a \in A_0\).
	We define similar subsets of \(B\): \(B_0\), \(B_1\), \(B_\infty\).
	Now:
	\begin{itemize}
		\item \(f\colon A \to B\) is a bijection between \(A_0\) and \(B_1\).
		      Clearly if some element \(a\) has an even number of ancestors, the ancestors of \(f(a)\) are exactly \(a\) and all of its ancestors, i.e.\ an odd number.
		      It is surjective because every element in \(B_1\) has an inverse \(f^{-1}(b) \in A_0\) by construction.
		\item \(g\colon B \to A\) is a bijection between \(B_0\) and \(A_1\) due to the same argument.
		\item \(f\) (or \(g\), both functions work for this proof) bijects \(A_\infty\) and \(B_\infty\).
		      It is surjective because for every element \(b \in B\), it has some ancestor \(f^{-1}(b) \in A_\infty\).
	\end{itemize}
	So the function \(h\colon A \to B\) is given by
	\[
		h(a) = \begin{cases}
			f(a)      & \text{if } a \in A_0      \\
			g^{-1}(a) & \text{if } a \in A_1      \\
			f(a)      & \text{if } a \in A_\infty
		\end{cases}
	\]
	is a bijection.
\end{proof}
Let us consider an example of this theorem in action.
Do \([0, 1]\) and \([0,1]\cup[2,3]\) biject?
All we need is to find an injection both ways.
\begin{itemize}
	\item Let \(f\colon [0,1] \to [0,1] \cup [2,3]\) be the identity map \(f(x) = x\).
	\item Let \(g\colon [0,1] \cup [2,3] \to [0,1]\) be given by \(g(x) = x/3\).
\end{itemize}

It would also be nice to have that, for any sets \(A\) and \(B\), either \(A\) injects into \(B\) or \(B\) injects into \(A\).
Then we can create a total ordering, rather than a partial ordering; we can compare any two sets.
This is proven to be true in the Part II course Logic and Set Theory.

\subsection{Arbitrarily large sets}
We have the sets
\[
	\mathbb N, \mathcal P(\mathbb N), \mathcal P(\mathcal P(\mathbb N)), \dots, \mathcal P^k(\mathbb N), \dots
\]
Does every set \(X\) inject into one of those?
It seems like this might be true, but the set
\[
	X = \mathbb N \cup \mathcal P(\mathbb N) \cup \mathcal P(\mathcal P(\mathbb N)) \cup \dots
\]
is a counterexample.
Let us continue further with this approach.
\[
	X' = X \cup \mathcal P(X) \cup \mathcal P(\mathcal P(X)) \cup \dots
\]
\[
	X'' = X' \cup \mathcal P(X') \cup \mathcal P(\mathcal P(X')) \cup \dots
\]
and so on.
Now, does every set inject into one of these sets?
No, consider
\[
	Y = X \cup X' \cup X'' \cup X''' \cup \dots
\]
We can keep going forever.
So we can't construct a set that all sets inject into.

\subsection{What happens next?}
This is the end of the Numbers and Sets course.
Here are a few of the courses that feed from this course.
\begin{itemize}
	\item Factorisation is taken further in the IB Groups, Rings and Modules course.
	\item Fermat's Little Theorem, squares modulo \(p\) etc.\ are taken further in II Number Theory.
	\item The analysis chapter is extended by IA Analysis.
	\item Countability and sizes of sets are taken further in the II Logic and Set Theory course.
\end{itemize}
